Document-level Consistency Verification in Machine Translation

نویسندگان

  • Tong Xiao
  • Jingbo Zhu
  • Shujie Yao
  • Hao Zhang
چکیده

Translation consistency is an important issue in document-level translation. However, the consistency in Machine Translation (MT) output is generally overlooked in most MT systems due to the lack of the use of document contexts. To address this issue, we present a simple and effective approach that incorporates document contexts into an existing Statistical Machine Translation (SMT) system for document-level translation. Experimental results show that our approach effectively reduces the errors caused by inconsistent translations (25% error reduction). More interestingly, it is observed that as a “bonus” our approach is able to improve the BLEU score of the SMT system.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Using Word Embeddings to Enforce Document-Level Lexical Consistency in Machine Translation

We integrate newmechanisms in a document-level machine translation decoder to improve the lexical consistency of document translations. First, we develop a document-level feature designed to score the lexical consistency of a translation. This feature, which applies towords that have been translated into different forms within the document, uses word embeddings to measure the adequacy of each w...

متن کامل

Document-Level Machine Translation Evaluation with Gist Consistency and Text Cohesion

Current Statistical Machine Translation (SMT) is significantly affected by Machine Translation (MT) evaluation metric. Nowadays the emergence of document-level MT research increases the demand for corresponding evaluation metric. This paper proposes two superior yet low-cost quantitative objective methods to enhance traditional MT metric by modeling document-level phenomena from the perspective...

متن کامل

Novel Document Level Features for Statistical Machine Translation

In this paper, we introduce document level features that capture necessary information to help MT system perform better word sense disambiguation in the translation process. We describe enhancements to a Maximum Entropy based translation model, utilizing long distance contextual features identified from the span of entire document and from both source and target sides, to improve the likelihood...

متن کامل

Document-level Re-ranking with Soft Lexical and Semantic Features for Statistical Machine Translation

We introduce two document-level features to polish baseline sentence-level translations generated by a state-of-the-art statistical machine translation (SMT) system. One feature uses the word-embedding technique to model the relation between a sentence and its context on the target side; the other feature is a crisp document-level token-type ratio of target-side translations for source-side wor...

متن کامل

Modeling Term Translation for Document-informed Machine Translation

Term translation is of great importance for statistical machine translation (SMT), especially document-informed SMT. In this paper, we investigate three issues of term translation in the context of documentinformed SMT and propose three corresponding models: (a) a term translation disambiguation model which selects desirable translations for terms in the source language with domain information,...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011